A memory access model for highly-threaded many-core architectures
Similar resources
Modeling Algorithm Performance on Highly-threaded Many-core Architectures
Chapter 1: Introduction. 1.1 Examples of Highly-threaded Many-core Architectures. 1.2 Research Questions. 1.3 Methodology for Performance Modeling ...
Memory Optimization in Codelet Execution Model on Many-core Architectures
The upcoming exa-scale era requires a parallel program execution model capable of achieving scalability, productivity, energy efficiency, and resiliency. The codelet model is a fine-grained dataflow-inspired execution model which is the focus of several tera-scale and exa-scale studies such as DARPA’s UHPC, DOE’s X-Stack, and the European TERAFLUX projects. Current codelet implementations aim t...
A Cross-Core Performance Model for Heterogeneous Many-Core Architectures
An accurate performance predictor to identify the most suitable core-architecture to execute each thread/workload in a heterogeneous many-core structure is proposed. The devised predictor is based on a linear regression model that considers several different parameters of the many-core processor architectures, including the cache size, issue width, re-order buffer size, load/store queues size, e...
Adding shared memory parallelism to FLASH for many-core architectures
In this paper we discuss evolutionary changes to FLASH to enable enhanced applications to run efficiently on both the current generation BG/P and the next generation BG/Q. We motivate the need for change by discussing current FLASH applications and the challenges they are facing on today’s architectures. Our solution to current challenges with a view to the next generation is mixed-mode MPI+Ope...
A polyphase filter for many-core architectures
In this article we discuss our implementation of a polyphase filter for real-time data processing in radio astronomy. The polyphase filter is a standard tool in digital signal processing and as such a well established algorithm. We describe in detail our implementation of the polyphase filter algorithm and its behaviour on three generations of NVIDIA GPU cards (Fermi, Kepler, Maxwell), on the I...
Journal
Journal title: Future Generation Computer Systems
Year: 2014
ISSN: 0167-739X
DOI: 10.1016/j.future.2013.06.020